Data Migration

Get an overview of AWS Snowball services, AWS DMS, and AWS DataSync.

General

Snowball is a petabyte-scale data transport solution for transferring data into or out of AWS. It uses a secure storage device for physical transportation. Snowball can import to S3 or export from S3.

AWS Snowball Client is software installed on a local computer and used to identify, compress, encrypt, and transfer data. Data is protected with 256-bit encryption, with keys managed in AWS KMS, and the device has a tamper-resistant enclosure with a Trusted Platform Module (TPM).

Recommendation

To speed up data transfer, run several instances of the AWS Snowball Client simultaneously in separate terminals, and transfer small files in batches.
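One way to prepare small files for batched transfer is to bundle them into archives first. The following is a minimal sketch using only Python's standard library. It is not part of the Snowball Client, and the helper names `plan_batches` and `write_batch` and the 1 GiB default limit are assumptions made for illustration.

```python
import os
import tarfile
from typing import Iterable, List


def plan_batches(paths: Iterable[str], max_batch_bytes: int = 1024**3) -> List[List[str]]:
    """Group files into batches whose combined size stays within max_batch_bytes.

    A single file larger than the limit ends up in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path in paths:
        size = os.path.getsize(path)
        # Start a new batch when adding this file would exceed the limit.
        if current and current_size + size > max_batch_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


def write_batch(batch: List[str], archive_path: str) -> str:
    """Bundle one batch into an uncompressed tar archive and return its path.

    Only the base name of each file is stored, so this sketch assumes
    file names are unique within a batch.
    """
    with tarfile.open(archive_path, "w") as tar:
        for path in batch:
            tar.add(path, arcname=os.path.basename(path))
    return archive_path
```

Each resulting archive can then be handed to a separate client instance, so several batches are copied in parallel.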

Watch out!

Snowball must be ordered from and returned to the same region.

The Snowball family

The Snowball family includes several services. The table below describes them at a high level:

Service | What it is
AWS Import/Export | Ship an external hard drive to AWS. Someone at AWS plugs it in and copies your data to S3.
AWS Snowball | Ruggedized NAS in a box that AWS ships to you. You can copy up to 80TB of data and ship it back to AWS. They copy the data over to S3.
AWS Snowball Edge | Same as Snowball, but with onboard Lambda and clustering
AWS Snowmobile | A literal shipping container full of storage (up to 100PB) and a truck to transport it

Additional details

  • Snowball (80TB; a 50TB model is available only in the USA)
  • Snowball Edge (100TB): comes with onboard storage and compute capabilities
  • Snowmobile: exabyte-scale transfers, with up to 100PB per Snowmobile
  • AWS Import/Export is when you send your own disks to AWS. It is being deprecated in favor of Snowball.
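The capacities above lend themselves to a quick sizing calculation. The sketch below is a rough estimate only: it treats the quoted figures as fully usable capacity (actual usable space is typically somewhat lower) and assumes 1 PB = 1000 TB. The function name `devices_needed` is hypothetical.

```python
import math

# Capacities in TB, taken from the figures quoted above.
CAPACITY_TB = {
    "Snowball": 80,
    "Snowball Edge": 100,
    "Snowmobile": 100_000,  # 100 PB, assuming 1 PB = 1000 TB
}


def devices_needed(dataset_tb: float, device: str) -> int:
    """Return how many devices of the given type are needed to hold dataset_tb."""
    if dataset_tb <= 0:
        raise ValueError("dataset_tb must be positive")
    return math.ceil(dataset_tb / CAPACITY_TB[device])


print(devices_needed(250, "Snowball"))       # 4
print(devices_needed(250, "Snowball Edge"))  # 3
```

Because the figures ignore formatting overhead, treat the result as a lower bound when ordering devices.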

AWS Database Migration Service

AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.

The AWS Database Migration Service can migrate your data to and from the most widely used commercial and open-source databases.

Supported migration paths include the following:

  • On-premises and EC2 databases to Amazon RDS or Amazon Aurora
  • Homogeneous migrations such as Oracle to Oracle
  • Heterogeneous migrations between different database platforms, such as Oracle or Microsoft SQL Server to Amazon Aurora

With AWS Database Migration Service, you can continuously replicate your data with high availability and consolidate databases into a petabyte-scale data warehouse by streaming data to Amazon Redshift and Amazon S3. DMS can also replicate data from on-premises sources to AWS, to Snowball, or to S3.
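To give a feel for how continuous replication is configured, a DMS replication task takes a migration type and a JSON table-mapping document that selects which tables to replicate. The sketch below only builds those two values as plain Python data and does not call AWS. The helper names and the `sales` and `hr` schema names are hypothetical; the `MigrationType` and `TableMappings` keys follow the parameter names of the DMS `CreateReplicationTask` API.

```python
import json
from typing import List


def selection_rule(schema: str, table: str = "%", rule_id: int = 1) -> dict:
    """Build one selection rule that includes the matching tables ("%" is a wildcard)."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": f"include-{schema}-{rule_id}",
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": "include",
    }


def table_mappings(schemas: List[str]) -> str:
    """Return the table-mapping JSON document for the given schemas."""
    rules = [selection_rule(s, rule_id=i) for i, s in enumerate(schemas, start=1)]
    return json.dumps({"rules": rules}, indent=2)


def replication_task_params(schemas: List[str], continuous: bool = True) -> dict:
    """Assemble the migration type and table mappings for a replication task.

    "full-load-and-cdc" copies existing data and then keeps replicating changes;
    "full-load" performs a one-time copy only.
    """
    return {
        "MigrationType": "full-load-and-cdc" if continuous else "full-load",
        "TableMappings": table_mappings(schemas),
    }


params = replication_task_params(["sales", "hr"])
print(params["MigrationType"])  # full-load-and-cdc
```

A real task also needs source and target endpoints and a replication instance, which are omitted here.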

When migrating databases to Amazon Aurora, Amazon Redshift, Amazon DynamoDB, or Amazon DocumentDB (with MongoDB compatibility), you can use DMS free for six months. You can also use it with the Schema Conversion Tool (SCT) to migrate databases to Amazon RDS or EC2-based databases.

The Schema Conversion Tool can copy database schemas for homogeneous migrations (same database engine) and convert schemas for heterogeneous migrations (different database engines). DMS is used for smaller, simpler conversions and also supports MongoDB and DynamoDB, whereas SCT is used for larger, more complex datasets such as data warehouses.
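The distinction between homogeneous and heterogeneous migrations can be expressed as a small decision helper. This is a deliberately naive sketch: it compares engine names as plain strings, so it would treat MySQL to Aurora MySQL as heterogeneous even though the engines are compatible. The function name `plan_migration` and the returned keys are hypothetical.

```python
def plan_migration(source_engine: str, target_engine: str) -> dict:
    """Classify a migration and name the schema step described above."""
    same_engine = source_engine.strip().lower() == target_engine.strip().lower()
    return {
        "kind": "homogeneous" if same_engine else "heterogeneous",
        "schema_step": "copy schema" if same_engine else "convert schema with SCT",
    }


print(plan_migration("Oracle", "Oracle"))
# {'kind': 'homogeneous', 'schema_step': 'copy schema'}
print(plan_migration("Oracle", "Amazon Aurora"))
# {'kind': 'heterogeneous', 'schema_step': 'convert schema with SCT'}
```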

AWS DataSync

AWS DataSync makes it simple and fast to move large amounts of data online between on-premises storage and Amazon S3 or Amazon Elastic File System (Amazon EFS).

Manual tasks related to data transfers can slow down migrations and burden IT operations. DataSync eliminates or automatically handles many of these tasks, including scripting copy jobs, scheduling and monitoring transfers, validating data, and optimizing network utilization.

The DataSync software agent connects to your Network File System (NFS) and Server Message Block (SMB) storage, so you don’t have to modify your applications. DataSync can transfer hundreds of terabytes and millions of files over the Internet or AWS Direct Connect links. The agent transfers data rapidly and deposits it into your designated Amazon S3 bucket or Amazon EFS file system.
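Whether an online transfer is practical depends largely on link speed. The following back-of-the-envelope sketch estimates transfer time for a given data size and bandwidth. The 80% utilization default is an assumption, decimal units are used (1 TB = 10^12 bytes), and the function name `transfer_days` is hypothetical.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimate the days needed to move data_tb terabytes over a link_mbps link."""
    if data_tb <= 0 or link_mbps <= 0:
        raise ValueError("data_tb and link_mbps must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    total_bits = data_tb * 1e12 * 8                     # terabytes -> bits
    effective_bits_per_second = link_mbps * 1e6 * utilization
    return total_bits / effective_bits_per_second / 86_400  # seconds -> days


print(round(transfer_days(100, 1_000), 1))   # 11.6 (100 TB over 1 Gbps)
print(round(transfer_days(100, 10_000), 1))  # 1.2  (100 TB over 10 Gbps)
```

If the estimate runs to weeks or months, a physical device from the Snowball family may be the faster option.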

You can use DataSync to migrate active data sets or archives to AWS, transfer data to the cloud for timely analysis and processing, or replicate data to AWS for business continuity.

DataSync can copy data between:

  • Network File System (NFS) file servers
  • Server Message Block (SMB) file servers
  • All Amazon Simple Storage Service (Amazon S3) storage classes
  • Amazon Elastic File System (Amazon EFS) file systems

All data is encrypted in transit with Transport Layer Security (TLS). DataSync supports using default encryption for S3 buckets using Amazon S3-Managed Encryption Keys (SSE-S3) and Amazon EFS file system encryption of data at rest.

Task scheduling lets you run a task periodically to detect changes in your source storage system and copy them to the destination. You can schedule tasks using the AWS DataSync Console or the AWS Command Line Interface (CLI), without writing and running scripts to manage repeated transfers.
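Tasks can also be scheduled programmatically. The sketch below only assembles the parameters for a scheduled task as a plain dictionary and does not call AWS. The keys follow the parameter names of the DataSync `CreateTask` API; the helper name, the task name, and the example ARNs are hypothetical placeholders.

```python
def scheduled_task_params(source_arn: str, destination_arn: str,
                          name: str, schedule_expression: str) -> dict:
    """Assemble parameters for a DataSync task that runs on a schedule."""
    # Schedules are written as cron(...) or rate(...) expressions.
    if not (schedule_expression.startswith(("cron(", "rate("))
            and schedule_expression.endswith(")")):
        raise ValueError("expected a cron(...) or rate(...) expression")
    return {
        "SourceLocationArn": source_arn,
        "DestinationLocationArn": destination_arn,
        "Name": name,
        "Schedule": {"ScheduleExpression": schedule_expression},
    }


params = scheduled_task_params(
    # Placeholder ARNs for illustration only.
    "arn:aws:datasync:us-east-1:111122223333:location/loc-source-example",
    "arn:aws:datasync:us-east-1:111122223333:location/loc-dest-example",
    "nightly-sync",
    "cron(0 2 * * ? *)",  # every day at 02:00 UTC
)
# With boto3 installed and credentials configured, the dictionary could be
# passed on as: boto3.client("datasync").create_task(**params)
print(params["Schedule"])  # {'ScheduleExpression': 'cron(0 2 * * ? *)'}
```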

When copying data to Amazon S3, DataSync automatically converts each file into a single S3 object in a 1:1 relationship and preserves POSIX metadata as Amazon S3 object metadata. When you copy objects that contain file system metadata back to file formats, the original file metadata that DataSync copied to S3 is restored. Similarly, when Amazon EFS is the destination for your data, DataSync preserves existing directory structures and file metadata. DataSync also supports VPC endpoints (powered by AWS PrivateLink) to move files directly into your Amazon VPC.
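The 1:1 mapping of files to objects can be pictured with a short local sketch. It walks a directory and yields one object key per file together with that file's POSIX attributes. It illustrates the idea only and is not DataSync's implementation; the function name `plan_objects` and the metadata key names are assumptions.

```python
import os
import stat
from typing import Dict, Iterator, Tuple


def plan_objects(root: str, prefix: str = "") -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (object_key, metadata) for every file under root, one object per file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            info = os.stat(path)
            # The object key mirrors the file's path relative to root.
            relative = os.path.relpath(path, root).replace(os.sep, "/")
            metadata = {
                # Illustrative key names, not necessarily those DataSync writes.
                "file-owner": str(info.st_uid),
                "file-group": str(info.st_gid),
                "file-permissions": oct(stat.S_IMODE(info.st_mode)),
                "file-mtime": str(info.st_mtime_ns),
            }
            yield prefix + relative, metadata
```

Because the directory structure is encoded in the key and the attributes travel as metadata, a copy in the reverse direction has what it needs to restore the original files.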
